Bradford Distribution (bradford)#

The Bradford distribution is a continuous, one-parameter family on \([0, 1]\) with a logarithmic CDF. It can be viewed as a scaled log-uniform (a.k.a. reciprocal) distribution, making it a convenient model (or prior) when you want a bounded variable that is more concentrated near 0 than a uniform distribution.

Learning goals

  • Understand the definition (PDF/CDF/quantile) and key limiting cases.

  • Derive mean/variance and write numerically stable NumPy implementations.

  • Simulate and visualize the distribution, and fit it with scipy.stats.bradford.

import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

from scipy import optimize, special, stats
import scipy

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

np.set_printoptions(precision=6, suppress=True)
rng = np.random.default_rng(42)

# Reproducibility / environment info
import sys
import plotly

print("python:", sys.version.split()[0])
print("numpy :", np.__version__)
print("scipy :", scipy.__version__)
print("plotly:", plotly.__version__)
python: 3.12.9
numpy : 1.26.2
scipy : 1.15.0
plotly: 6.5.2

1) Title & Classification#

  • Name: Bradford distribution (bradford)

  • Type: continuous

  • Support: \(x \in [0, 1]\) (standard form)

  • Parameter space: one shape parameter \(c > 0\)

SciPy implements a location–scale family: if \(X \sim \text{Bradford}(c)\) on \([0,1]\), then

\[ Y = \text{loc} + \text{scale} \cdot X \]

has support \([\text{loc},\, \text{loc}+\text{scale}]\).
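As a quick illustration using only documented `scipy.stats` methods, freezing `bradford` with `loc` and `scale` shifts the support accordingly (the parameter values here are arbitrary):

```python
from scipy import stats

# Bradford(c=3) stretched onto [2, 12]: loc=2, scale=10
d = stats.bradford(3.0, loc=2.0, scale=10.0)

print(d.support())                # support becomes (2.0, 12.0)
print(d.cdf(2.0), d.cdf(12.0))    # CDF runs from 0 at loc to 1 at loc+scale
```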

2) Intuition & Motivation#

What it models#

A good mental model is:

Take a number that is uniform on a log-scale, then linearly rescale it into \([0,1]\).

Concretely, if

  • \(U \sim \text{Uniform}(0,1)\)

  • \(Y = (1+c)^U\) (so \(Y\) is log-uniform on \([1, 1+c]\))

  • \(X = \dfrac{Y-1}{c}\)

then \(X\) has a Bradford distribution on \([0,1]\).

This creates a monotonically decreasing PDF: small values are more likely, but the variable is still bounded.
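A minimal sketch verifying this construction against `scipy.stats.bradford` (the seed, shape value, and sample size are arbitrary choices for the check):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
c = 5.0

u = rng.random(20_000)      # U ~ Uniform(0, 1)
y = (1.0 + c) ** u          # Y log-uniform on [1, 1+c]
x = (y - 1.0) / c           # X should be Bradford(c) on [0, 1]

# Agreement with SciPy's implementation
print("KS p-value :", stats.kstest(x, stats.bradford(c).cdf).pvalue)
print("sample mean:", x.mean(), "vs theory:", stats.bradford(c).mean())
```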

Typical real-world use cases#

  • Diminishing returns / “early wins” curves on a normalized domain: e.g. cumulative share of results found after scanning the first fraction of sources.

  • Priors for small probabilities on \([0,1]\) (a simple alternative to a very skewed Beta prior).

  • Simple bounded generative features: sampling a “small-ish” proportion or weight without going unbounded.

Relations to other distributions#

  • As \(c \to 0^+\), Bradford approaches Uniform\((0,1)\).

  • \(1 + cX\) is log-uniform on \([1, 1+c]\).

  • For large \(c\), the PDF behaves like a truncated \(1/x\)-type shape (strong concentration near 0) while remaining integrable on \([0,1]\).
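The uniform limit is easy to check numerically; for small \(c\) the PDF is flat to within roughly \(c/2\):

```python
import numpy as np
from scipy import stats

x = np.linspace(0.0, 1.0, 101)
for c in (1e-1, 1e-3, 1e-6):
    # Deviation of the Bradford PDF from the Uniform(0,1) density 1
    dev = np.max(np.abs(stats.bradford(c).pdf(x) - 1.0))
    print(f"c={c:g}: max |pdf - 1| = {dev:.2e}")
```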

3) Formal Definition#

Let \(c > 0\). The PDF of the Bradford distribution on \([0,1]\) is

\[ f(x; c) = \frac{c}{\ln(1+c)} \cdot \frac{1}{1 + cx}, \qquad 0 \le x \le 1. \]

The CDF is

\[ F(x; c) = \frac{\ln(1+cx)}{\ln(1+c)}, \qquad 0 \le x \le 1. \]

A very useful consequence is the quantile function (inverse CDF): for \(u \in [0,1]\),

\[ Q(u; c) = F^{-1}(u) = \frac{(1+c)^u - 1}{c}. \]

Numerically stable notation#

When implementing, prefer

  • log1p(z) for \(\ln(1+z)\)

  • expm1(z) for \(\exp(z)-1\)

to avoid loss of precision for small \(c\).

def _validate_c(c: float) -> float:
    c = float(c)
    if not np.isfinite(c) or c <= 0:
        raise ValueError(f"c must be finite and > 0, got {c!r}")
    return c


def bradford_pdf(x: np.ndarray, c: float) -> np.ndarray:
    '''Bradford(c) PDF on [0,1].'''
    c = _validate_c(c)
    x = np.asarray(x, dtype=float)
    L = np.log1p(c)
    out = np.zeros_like(x)
    mask = (0.0 <= x) & (x <= 1.0)
    out[mask] = c / (L * (1.0 + c * x[mask]))
    return out


def bradford_cdf(x: np.ndarray, c: float) -> np.ndarray:
    '''Bradford(c) CDF on [0,1].'''
    c = _validate_c(c)
    x = np.asarray(x, dtype=float)
    L = np.log1p(c)
    out = np.zeros_like(x)
    out[x >= 1.0] = 1.0
    mask = (0.0 <= x) & (x < 1.0)
    out[mask] = np.log1p(c * x[mask]) / L
    return out


def bradford_ppf(u: np.ndarray, c: float) -> np.ndarray:
    '''Bradford(c) quantile function (inverse CDF) for u in [0,1].'''
    c = _validate_c(c)
    u = np.asarray(u, dtype=float)
    if np.any((u < 0) | (u > 1)):
        raise ValueError("u must be in [0,1]")

    # Q(u) = ((1+c)^u - 1) / c = expm1(u * log1p(c)) / c
    return np.expm1(u * np.log1p(c)) / c
# Sanity checks: PDF integrates to ~1 and matches SciPy
c = 7.5
x_grid = np.linspace(0, 1, 20001)

pdf_vals = bradford_pdf(x_grid, c)
area = np.trapz(pdf_vals, x_grid)
print("∫ pdf dx ≈", area)

dist = stats.bradford(c)
max_abs_pdf_diff = np.max(np.abs(pdf_vals - dist.pdf(x_grid)))
max_abs_cdf_diff = np.max(np.abs(bradford_cdf(x_grid, c) - dist.cdf(x_grid)))
print("max |pdf - scipy|:", max_abs_pdf_diff)
print("max |cdf - scipy|:", max_abs_cdf_diff)
∫ pdf dx ≈ 1.0000000054000913
max |pdf - scipy|: 8.881784197001252e-16
max |cdf - scipy|: 2.220446049250313e-16

4) Moments & Properties#

Let \(L = \ln(1+c)\).

Mean and variance#

\[ \mathbb{E}[X] = \frac{c - \ln(1+c)}{c\,\ln(1+c)} = \frac{1}{L} - \frac{1}{c}. \]

\[ \mathrm{Var}(X) = \frac{(c+2)\ln(1+c) - 2c}{2c\,[\ln(1+c)]^2} = \frac{(c+2)L - 2c}{2cL^2}. \]

Higher moments, skewness, kurtosis#

You can write closed-forms for raw moments \(\mathbb{E}[X^k]\) for integer \(k\) by integrating

\[ \mathbb{E}[X^k] = \frac{c}{L} \int_0^1 \frac{x^k}{1+cx} \, dx. \]

For \(k=1,2,3,4\) these simplify nicely (see code below). Skewness and kurtosis follow from the first four moments.
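As an independent sanity check (separate from the helper functions below), these moments can be compared against direct numerical integration with `scipy.integrate.quad` and SciPy's raw-moment method:

```python
import numpy as np
from scipy import integrate, stats

c = 7.5
d = stats.bradford(c)

for k in range(1, 5):
    # Numerical E[X^k] = ∫ x^k f(x; c) dx over [0, 1]
    num, _ = integrate.quad(lambda x, k=k: x**k * d.pdf(x), 0.0, 1.0)
    print(f"k={k}: quad={num:.10f}  scipy .moment({k})={d.moment(k):.10f}")
```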

MGF / characteristic function#

The MGF exists for all real \(t\) (bounded support). Using the exponential integral \(\operatorname{Ei}\),

\[ M(t) = \mathbb{E}[e^{tX}] = \frac{e^{-t/c}}{\ln(1+c)}\left[\operatorname{Ei}\left(\frac{t(1+c)}{c}\right) - \operatorname{Ei}\left(\frac{t}{c}\right)\right]. \]

The characteristic function is \(\varphi(\omega) = M(i\omega)\).

Entropy#

The differential entropy has a compact form:

\[ h(X) = -\mathbb{E}[\ln f(X)] = \ln\left(\frac{\ln(1+c)}{c}\right) + \frac{1}{2}\ln(1+c). \]

A neat shortcut uses the log-uniform transformation: \(\ln(1+cX) / \ln(1+c) \sim \text{Uniform}(0,1)\).
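A quick check of the entropy formula against SciPy's built-in `entropy` method (writing \(L = \ln(1+c)\)):

```python
import numpy as np
from scipy import stats

for c in (0.5, 5.0, 50.0):
    L = np.log1p(c)
    h_formula = np.log(L / c) + 0.5 * L       # closed form from above
    h_scipy = float(stats.bradford(c).entropy())
    print(f"c={c:g}: formula={h_formula:.12f}  scipy={h_scipy:.12f}")
```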

def bradford_raw_moments_1_to_4(c: float) -> tuple[float, float, float, float]:
    '''Return (E[X], E[X^2], E[X^3], E[X^4]) for X ~ Bradford(c) on [0,1].'''
    c = _validate_c(c)
    L = np.log1p(c)

    ex1 = 1.0 / L - 1.0 / c
    ex2 = 1.0 / (2.0 * L) - 1.0 / (c * L) + 1.0 / (c**2)
    ex3 = 1.0 / (3.0 * L) - 1.0 / (2.0 * c * L) + 1.0 / (c**2 * L) - 1.0 / (c**3)
    ex4 = (
        1.0 / (4.0 * L)
        - 1.0 / (3.0 * c * L)
        + 1.0 / (2.0 * c**2 * L)
        - 1.0 / (c**3 * L)
        + 1.0 / (c**4)
    )
    return ex1, ex2, ex3, ex4


def bradford_mean_var(c: float) -> tuple[float, float]:
    c = _validate_c(c)
    L = np.log1p(c)
    mean = 1.0 / L - 1.0 / c
    var = ((c + 2.0) * L - 2.0 * c) / (2.0 * c * L**2)
    return mean, var


def bradford_skew_kurtosis_excess(c: float) -> tuple[float, float]:
    '''Return (skewness, excess kurtosis) via the first four raw moments.'''
    ex1, ex2, ex3, ex4 = bradford_raw_moments_1_to_4(c)

    mean = ex1
    var = ex2 - mean**2
    sigma = np.sqrt(var)

    mu3 = ex3 - 3 * mean * ex2 + 2 * mean**3
    mu4 = ex4 - 4 * mean * ex3 + 6 * mean**2 * ex2 - 3 * mean**4

    skew = mu3 / sigma**3
    excess_kurt = mu4 / var**2 - 3.0
    return skew, excess_kurt


def bradford_entropy(c: float) -> float:
    c = _validate_c(c)
    L = np.log1p(c)
    return np.log(L / c) + 0.5 * L


def bradford_mgf(t: np.ndarray, c: float) -> np.ndarray:
    '''MGF M(t) using SciPy's Ei implementation (works for real or complex t).'''
    c = _validate_c(c)
    t = np.asarray(t)
    L = np.log1p(c)

    # Ei(0) = -inf, but the limit M(0) = 1; handle t=0 explicitly.
    out = np.empty_like(t, dtype=np.result_type(t, float))
    mask0 = t == 0
    out[mask0] = 1.0
    mask = ~mask0
    if np.any(mask):
        tt = t[mask]
        out[mask] = np.exp(-tt / c) / L * (special.expi(tt * (1.0 + c) / c) - special.expi(tt / c))
    return out

c = 7.5
mean, var = bradford_mean_var(c)
skew, excess_kurt = bradford_skew_kurtosis_excess(c)

print("mean   :", mean)
print("var    :", var)
print("skew   :", skew)
print("ex.kurt:", excess_kurt)
print("entropy:", bradford_entropy(c))

# Compare to SciPy's built-in stats
m, v, s, k = stats.bradford(c).stats(moments="mvsk")
print("\nSciPy mvsk:", float(m), float(v), float(s), float(k))

# Quick check of MGF at t=0 (should be 1)
print("\nM(0) ≈", bradford_mgf(0.0, c))

5) Parameter Interpretation#

The single parameter \(c\) controls how strongly the density piles up near 0.

  • Small \(c\): \(\ln(1+c) \approx c\), so \(f(x;c) \approx 1\) → nearly Uniform\((0,1)\).

  • Large \(c\): the PDF becomes strongly decreasing and more concentrated near 0.

Because the support is fixed, this is a convenient “one-knob” family for bounded, right-skewed behavior.
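One concrete way to see the knob at work: setting \(F(m)=\tfrac12\) in the CDF gives the closed-form median \(m = (\sqrt{1+c} - 1)/c\), which moves from \(\tfrac12\) toward 0 as \(c\) grows. A small sketch comparing it with SciPy's quantile:

```python
import numpy as np
from scipy import stats

for c in (0.2, 1.0, 5.0, 20.0):
    m = (np.sqrt(1.0 + c) - 1.0) / c             # closed-form median
    m_scipy = float(stats.bradford(c).ppf(0.5))  # SciPy quantile at 0.5
    print(f"c={c:>5}: median={m:.4f}  (scipy: {m_scipy:.4f})")
```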

# Visualize how the PDF and CDF change with c
c_values = [0.2, 1.0, 5.0, 20.0]
x = np.linspace(0, 1, 400)

fig_pdf = go.Figure()
fig_cdf = go.Figure()

for c in c_values:
    fig_pdf.add_trace(go.Scatter(x=x, y=bradford_pdf(x, c), mode="lines", name=f"c={c}"))
    fig_cdf.add_trace(go.Scatter(x=x, y=bradford_cdf(x, c), mode="lines", name=f"c={c}"))

fig_pdf.update_layout(title="Bradford PDF on [0,1]", xaxis_title="x", yaxis_title="f(x)")
fig_cdf.update_layout(title="Bradford CDF on [0,1]", xaxis_title="x", yaxis_title="F(x)")

fig_pdf.show()
fig_cdf.show()
# How mean/variance evolve with c
c_grid = np.logspace(-2, 2, 250)  # 0.01 to 100
means = np.array([bradford_mean_var(c)[0] for c in c_grid])
vars_ = np.array([bradford_mean_var(c)[1] for c in c_grid])

fig = go.Figure()
fig.add_trace(go.Scatter(x=c_grid, y=means, mode="lines", name="mean"))
fig.add_trace(go.Scatter(x=c_grid, y=vars_, mode="lines", name="variance", yaxis="y2"))

fig.update_layout(
    title="Mean and variance vs c",
    xaxis_title="c (log scale)",
    xaxis_type="log",
    yaxis=dict(title="mean"),
    yaxis2=dict(title="variance", overlaying="y", side="right"),
)
fig.show()

6) Derivations#

Let \(L = \ln(1+c)\) and \(f(x;c)=\dfrac{c}{L(1+cx)}\) for \(x\in[0,1]\).

Expectation#

\[ \mathbb{E}[X] = \frac{c}{L}\int_0^1 \frac{x}{1+cx}\,dx. \]

Use the identity

\[ \frac{x}{1+cx} = \frac{1}{c}\left(1 - \frac{1}{1+cx}\right). \]

Then

\[ \int_0^1 \frac{x}{1+cx}\,dx = \frac{1}{c}\left[ x - \frac{1}{c}\ln(1+cx)\right]_{0}^{1} = \frac{1}{c}\left(1 - \frac{\ln(1+c)}{c}\right). \]

Multiplying by \(\frac{c}{L}\) gives

\[ \mathbb{E}[X] = \frac{1}{L} - \frac{1}{c}. \]

Variance#

First compute

\[ \mathbb{E}[X^2] = \frac{c}{L}\int_0^1 \frac{x^2}{1+cx}\,dx. \]

Do a small polynomial decomposition:

\[ \frac{x^2}{1+cx} = \frac{1}{c}x - \frac{1}{c^2} + \frac{1}{c^2(1+cx)}. \]

Integrate term by term to obtain

\[ \mathbb{E}[X^2] = \frac{1}{2L} - \frac{1}{cL} + \frac{1}{c^2}. \]

Finally, \(\mathrm{Var}(X)=\mathbb{E}[X^2] - (\mathbb{E}[X])^2\) simplifies to

\[ \mathrm{Var}(X) = \frac{(c+2)L - 2c}{2cL^2}. \]

Likelihood (i.i.d. sample)#

For data \(x_1,\dots,x_n \in [0,1]\), the log-likelihood is

\[ \ell(c) = \sum_{i=1}^n \ln f(x_i;c) = n\ln c - n\ln \ln(1+c) - \sum_{i=1}^n \ln(1+c x_i). \]

The score equation \(\partial \ell/\partial c = 0\) has no closed-form solution for \(\hat c\), but it is smooth and easy to solve numerically.

def bradford_loglik(c: float, x: np.ndarray) -> float:
    c = _validate_c(c)
    x = np.asarray(x, dtype=float)
    if np.any((x < 0) | (x > 1)):
        raise ValueError("All x must be in [0,1] for the standard Bradford distribution")

    L = np.log1p(c)
    return x.size * np.log(c) - x.size * np.log(L) - np.sum(np.log1p(c * x))


def bradford_mle_c(x: np.ndarray, c_init: float = 1.0) -> float:
    x = np.asarray(x, dtype=float)

    def nll(log_c: float) -> float:
        return -bradford_loglik(np.exp(log_c), x)

    # Bounded scalar minimization over log(c) is more robust than BFGS,
    # whose line search can fail with "precision loss" on this nearly flat
    # 1-D objective. c_init is kept for API compatibility; the bounded
    # search does not need a starting point.
    res = optimize.minimize_scalar(nll, bounds=(np.log(1e-8), np.log(1e8)), method="bounded")
    if not res.success:
        raise RuntimeError(f"MLE optimization failed: {res.message}")
    return float(np.exp(res.x))


# Demonstrate MLE recovery on simulated data
c_true = 8.0
n = 2000
x_sim = bradford_ppf(rng.random(n), c_true)

c_hat = bradford_mle_c(x_sim, c_init=5.0)
print("c_true:", c_true)
print("c_hat :", c_hat)

# SciPy fit: fixing loc=0, scale=1 focuses fit on shape parameter
c_scipy, loc_scipy, scale_scipy = stats.bradford.fit(x_sim, floc=0, fscale=1)
print("c_fit (scipy):", c_scipy)
c_true: 8.0
c_hat : 7.768058113979663
c_fit (scipy): 7.768066406250014

7) Sampling & Simulation (NumPy-only)#

Because the CDF is explicit and strictly increasing, inverse transform sampling is the natural choice.

From

\[ F(x;c) = \frac{\ln(1+cx)}{\ln(1+c)}, \]

set \(U\sim \text{Uniform}(0,1)\) and solve \(U=F(X;c)\):

\[ \ln(1+cX) = U\,\ln(1+c) \quad\Rightarrow\quad 1+cX = (1+c)^U \quad\Rightarrow\quad X = \frac{(1+c)^U - 1}{c}. \]

Algorithm#

  1. Sample \(U \sim \text{Uniform}(0,1)\).

  2. Return \(X = \operatorname{expm1}(U \cdot \operatorname{log1p}(c)) / c\).

This is fast, exact (up to floating point), and stable with log1p/expm1.

def bradford_rvs_numpy(c: float, size: int | tuple[int, ...], rng: np.random.Generator) -> np.ndarray:
    c = _validate_c(c)
    u = rng.random(size)
    return np.expm1(u * np.log1p(c)) / c


c = 10.0
samples = bradford_rvs_numpy(c, size=200_000, rng=rng)

mean_theory, var_theory = bradford_mean_var(c)
print("empirical mean:", samples.mean())
print("theory mean   :", mean_theory)
print("empirical var :", samples.var())
print("theory var    :", var_theory)

8) Visualization#

We’ll visualize

  • the PDF and CDF (already shown above for multiple \(c\))

  • Monte Carlo samples (histogram + empirical CDF)

and compare to the theoretical curves.

# Histogram (Monte Carlo) + theoretical PDF overlay
c = 10.0
n = 50_000
x = bradford_rvs_numpy(c, size=n, rng=rng)

x_plot = np.linspace(0, 1, 400)

hist = px.histogram(
    x,
    nbins=60,
    histnorm="probability density",
    opacity=0.6,
    title=f"Monte Carlo samples vs PDF (c={c})",
)

hist.add_trace(go.Scatter(x=x_plot, y=bradford_pdf(x_plot, c), mode="lines", name="theoretical pdf"))
hist.update_layout(xaxis_title="x", yaxis_title="density")
hist.show()
# Empirical CDF vs theoretical CDF
c = 10.0
x = bradford_rvs_numpy(c, size=30_000, rng=rng)

x_sorted = np.sort(x)
emp_cdf = np.arange(1, x_sorted.size + 1) / x_sorted.size

fig = go.Figure()
fig.add_trace(go.Scatter(x=x_sorted, y=emp_cdf, mode="lines", name="empirical CDF"))

x_plot = np.linspace(0, 1, 400)
fig.add_trace(go.Scatter(x=x_plot, y=bradford_cdf(x_plot, c), mode="lines", name="theoretical CDF"))

fig.update_layout(title=f"Empirical vs theoretical CDF (c={c})", xaxis_title="x", yaxis_title="CDF")
fig.show()

9) SciPy Integration (scipy.stats.bradford)#

SciPy’s stats.bradford implements the same one-parameter family, plus optional loc and scale.

Common methods:

  • pdf, cdf, ppf

  • rvs for sampling

  • fit for MLE

When your data are already on \([0,1]\), it’s often best to fix loc=0 and scale=1 during fitting.

c = 4.0
x = np.linspace(0, 1, 6)

dist = stats.bradford(c)
print("pdf:", dist.pdf(x))
print("cdf:", dist.cdf(x))
print("rvs:", dist.rvs(size=5, random_state=rng))

# Fit on synthetic data (standard support): fix loc and scale
x_sim = dist.rvs(size=5000, random_state=rng)

c_hat, loc_hat, scale_hat = stats.bradford.fit(x_sim, floc=0, fscale=1)
print("\ntrue c:", c)
print("fit  c:", c_hat)

10) Statistical Use Cases#

A) Hypothesis testing (likelihood ratio test)#

For a parametric family \(\{f(x;c)\}\), a common test is

  • \(H_0: c = c_0\) vs \(H_1: c \ne c_0\).

Under regularity conditions, the likelihood ratio statistic

\[ \Lambda = 2\,[\ell(\hat c) - \ell(c_0)] \]

is approximately \(\chi^2_1\) for large \(n\).

B) Bayesian modeling#

Because Bradford is a flexible, right-skewed distribution on \([0,1]\), it can serve as a prior for a probability parameter \(p\) when you expect \(p\) to be small.

Below, we do a simple grid posterior for a Binomial likelihood without needing any probabilistic programming library.

C) Generative modeling#

In a generative pipeline, Bradford is a cheap way to sample a bounded “activation/proportion/weight” variable that is typically small but occasionally larger. With loc/scale, you can place the support on any finite interval.

# A) Likelihood ratio test example: H0: c = c0
from scipy.stats import chi2

c0 = 6.0
c_true = 10.0
n = 3000

x = stats.bradford(c_true).rvs(size=n, random_state=rng)

c_hat = bradford_mle_c(x, c_init=c0)
ll_hat = bradford_loglik(c_hat, x)
ll_0 = bradford_loglik(c0, x)

lrt = 2 * (ll_hat - ll_0)
p_value = chi2.sf(lrt, df=1)  # survival function avoids cancellation in 1 - cdf

print("c0   :", c0)
print("c_hat:", c_hat)
print("LRT statistic:", lrt)
print("p-value ~", p_value)
# B) Bayesian modeling: Bradford prior for p in a Binomial model
# Observations: k successes out of n
n = 50
k = 3

c_prior = 12.0  # prior mass concentrated near 0

p_grid = np.linspace(1e-6, 1 - 1e-6, 4000)
prior = bradford_pdf(p_grid, c_prior)

# Binomial likelihood up to proportionality: p^k (1-p)^(n-k)
log_like = k * np.log(p_grid) + (n - k) * np.log1p(-p_grid)
like = np.exp(log_like - log_like.max())

posterior_unnorm = prior * like
posterior = posterior_unnorm / np.trapz(posterior_unnorm, p_grid)

post_mean = np.trapz(p_grid * posterior, p_grid)
post_map = p_grid[np.argmax(posterior)]

print("posterior mean:", post_mean)
print("posterior MAP :", post_map)

fig = go.Figure()
fig.add_trace(go.Scatter(x=p_grid, y=prior / np.trapz(prior, p_grid), mode="lines", name="prior (normalized)"))
fig.add_trace(go.Scatter(x=p_grid, y=posterior, mode="lines", name="posterior"))
fig.update_layout(
    title=f"Bradford prior (c={c_prior}) updated by Binomial(n={n}, k={k})",
    xaxis_title="p",
    yaxis_title="density",
)
fig.show()
posterior mean: 0.06932741764435588
posterior MAP : 0.05276408527131783
# C) Generative modeling: placing Bradford on an arbitrary interval [a, b]
a, b = 2.0, 7.0
c = 8.0

# Sample X on [0,1], then map to [a,b]
x = stats.bradford(c).rvs(size=50_000, random_state=rng)
y = a + (b - a) * x

fig = px.histogram(y, nbins=60, histnorm="probability density", title=f"Bradford(c={c}) scaled to [{a}, {b}]")
fig.update_layout(xaxis_title="y", yaxis_title="density")
fig.show()

11) Pitfalls#

  • Parameter validity: the Bradford shape parameter must satisfy \(c>0\). (Uniform\((0,1)\) appears as the limit \(c\to 0^+\).)

  • Data support: for the standard form, you must have \(x \in [0,1]\). If your data live on \([a,b]\), consider the location–scale form.

  • Numerical stability:

    • Use log1p(c) instead of log(1+c).

    • Use log1p(c*x) instead of log(1+c*x).

    • Use expm1(u*log1p(c)) when sampling.

  • Fitting quirks: if you let loc and scale float freely, SciPy may fit a shifted/scaled Bradford that no longer matches your intended support. Fix them when appropriate.

  • Name collision: “Bradford distribution” is sometimes discussed alongside Bradford’s law in bibliometrics; be clear whether you mean the continuous distribution implemented by SciPy.
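To make the stability bullets concrete, here is a tiny demonstration of why log1p matters: for very small \(c\), computing \(1+c\) first rounds the information away entirely before the logarithm is taken.

```python
import numpy as np

c = 1e-17  # smaller than machine epsilon relative to 1.0

print(np.log(1.0 + c))  # 0.0 -- the sum 1.0 + c already rounded to 1.0
print(np.log1p(c))      # ~1e-17 -- accurate
```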

12) Summary#

  • Bradford is a continuous distribution on \([0,1]\) with PDF \(\propto (1+cx)^{-1}\) and CDF \(\propto \ln(1+cx)\).

  • It is equivalent to a scaled log-uniform: \(1+cX\) is log-uniform on \([1,1+c]\).

  • The quantile function is explicit, enabling simple NumPy-only inverse-CDF sampling.

  • Mean/variance/entropy have clean formulas; skewness/kurtosis can be computed from low-order moments.

  • scipy.stats.bradford provides a ready-made implementation with pdf, cdf, rvs, and fit.